{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Small World Graphs\n",
    "\n",
    "Code examples from [Think Complexity, 2nd edition](http://greenteapress.com/wp/complexity2), Chapter 3\n",
    "\n",
    "Copyright 2016 Allen Downey, [MIT License](http://opensource.org/licenses/MIT)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from __future__ import print_function, division\n",
    "\n",
    "%matplotlib inline\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import networkx as nx\n",
    "import numpy as np\n",
    "\n",
    "import thinkplot\n",
    "\n",
    "# colors from our friends at http://colorbrewer2.org\n",
    "COLORS = ['#8dd3c7','#ffffb3','#bebada','#fb8072','#80b1d3','#fdb462',\n",
    "          '#b3de69','#fccde5','#d9d9d9','#bc80bd','#ccebc5','#ffed6f']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from thinkstats2 import RandomSeed\n",
    "RandomSeed(17)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Regular ring lattice"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make a ring lattice, I'll start with a generator function that yields edges between each node and the next `halfk` neighbors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def adjacent_edges(nodes, halfk):\n",
    "    \"\"\"Yields edges between each node and `halfk` neighbors.\n",
    "    \n",
    "    halfk: number of edges from each node\n",
    "    \"\"\"\n",
    "    n = len(nodes)\n",
    "    for i, u in enumerate(nodes):\n",
    "        for j in range(i+1, i+halfk+1):\n",
    "            v = nodes[j % n]\n",
    "            yield u, v"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can test it with 3 nodes and `halfk=1`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nodes = range(3)\n",
    "for edge in adjacent_edges(nodes, 1):\n",
    "    print(edge)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we use `adjacent_edges` to write `make_ring_lattice`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def make_ring_lattice(n, k):\n",
    "    \"\"\"Makes a ring lattice with `n` nodes and degree `k`.\n",
    "    \n",
    "    Note: this only works correctly if k is even.\n",
    "    \n",
    "    n: number of nodes\n",
    "    k: degree of each node\n",
    "    \"\"\"\n",
    "    G = nx.Graph()\n",
    "    nodes = range(n)\n",
    "    G.add_nodes_from(nodes)\n",
    "    G.add_edges_from(adjacent_edges(nodes, k//2))\n",
    "    return G"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we can test it out with `n=10` and `k=4`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lattice = make_ring_lattice(10, 4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nx.draw_circular(lattice, \n",
    "                 node_color=COLORS[0], \n",
    "                 node_size=1000, \n",
    "                 with_labels=True)\n",
    "plt.savefig('chap03-1.pdf')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise:** To see how this function fails when `k` is odd, run it again with `k=2` or `k=5`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## WS graph"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make a WS, you start with a ring lattice and then rewire."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def make_ws_graph(n, k, p):\n",
    "    \"\"\"Makes a Watts-Strogatz graph.\n",
    "    \n",
    "    n: number of nodes\n",
    "    k: degree of each node\n",
    "    p: probability of rewiring an edge\n",
    "    \"\"\"\n",
    "    ws = make_ring_lattice(n, k)\n",
    "    rewire(ws, p)\n",
    "    return ws"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's the function that does the rewiring"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from numpy.random import choice\n",
    "\n",
    "def rewire(G, p):\n",
    "    \"\"\"Rewires each edge with probability `p`.\n",
    "    \n",
    "    G: Graph\n",
    "    p: float\n",
    "    \"\"\"\n",
    "    nodes = set(G)\n",
    "    for edge in G.edges():\n",
    "        if flip(p):\n",
    "            u, v = edge\n",
    "            choices = nodes - {u} - set(G[u])\n",
    "            new_v = choice(tuple(choices))\n",
    "            G.remove_edge(u, v)\n",
    "            G.add_edge(u, new_v)\n",
    "            \n",
    "def flip(p):\n",
    "    \"\"\"Returns True with probability `p`.\"\"\"\n",
    "    return np.random.random() < p"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's an example with `p=0.2`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ws = make_ws_graph(10, 4, 0.2)\n",
    "nx.draw_circular(ws, \n",
    "                 node_color=COLORS[1], \n",
    "                 node_size=1000, \n",
    "                 with_labels=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Just checking that we have the same number of edges we started with:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "len(lattice.edges()), len(ws.edges())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now I'll generate a plot that shows WS graphs for a few values of `p`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "n = 10\n",
    "k = 4\n",
    "ns = 40\n",
    "\n",
    "thinkplot.preplot(cols=3)\n",
    "ws = make_ws_graph(n, k, 0)\n",
    "nx.draw_circular(ws, node_size=ns)\n",
    "thinkplot.config(axis='equal')\n",
    "\n",
    "thinkplot.subplot(2)\n",
    "ws = make_ws_graph(n, k, 0.2)\n",
    "nx.draw_circular(ws, node_size=ns)\n",
    "thinkplot.config(axis='equal')\n",
    "\n",
    "thinkplot.subplot(3)\n",
    "ws = make_ws_graph(n, k, 1.0)\n",
    "nx.draw_circular(ws, node_size=ns)\n",
    "thinkplot.config(axis='equal')\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.subplots_adjust(wspace=0, hspace=0, left=0, right=1)\n",
    "plt.savefig('chap03-2.pdf')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise:** What is the order of growth of `rewire`?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "\"\"\"The loop executes once for each edge.  Inside the loop, everything is constant\n",
    "time except computing `choices`, which is linear in `n`.  So the total run time is \n",
    "`O(nm)`.\"\"\";"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## Clustering"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following function computes the local clustering coefficient for a given node, `u`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def node_clustering(G, u):\n",
    "    \"\"\"Computes local clustering coefficient for `u`.\n",
    "    \n",
    "    G: Graph\n",
    "    u: node\n",
    "    \n",
    "    returns: float\n",
    "    \"\"\"\n",
    "    neighbors = G[u]\n",
    "    k = len(neighbors)\n",
    "    if k < 2:\n",
    "        return 0\n",
    "        \n",
    "    total = k * (k-1) / 2\n",
    "    exist = 0    \n",
    "    for v, w in all_pairs(neighbors):\n",
    "        if G.has_edge(v, w):\n",
    "            exist +=1\n",
    "    return exist / total\n",
    "\n",
    "def all_pairs(nodes):\n",
    "    \"\"\"Generates all pairs of nodes.\"\"\"\n",
    "    for i, u in enumerate(nodes):\n",
    "        for j, v in enumerate(nodes):\n",
    "            if i < j:\n",
    "                yield u, v"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The network average clustering coefficient is just the mean of the local CCs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def clustering_coefficient(G):\n",
    "    \"\"\"Average of the local clustering coefficients.\n",
    "    \n",
    "    G: Graph\n",
    "    \n",
    "    returns: float\n",
    "    \"\"\"\n",
    "    cc = np.mean([node_clustering(G, node) for node in G])\n",
    "    return cc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In a ring lattice with `k=4`, the clustering coefficient for each node should be 0.5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lattice = make_ring_lattice(10, 4)\n",
    "node_clustering(lattice, 1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And the network average should be 0.5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "clustering_coefficient(lattice)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Correct."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%timeit clustering_coefficient(lattice)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise:** Write a version of `node_clustering` that replaces the `for` loop with a list comprehension.  Is it faster?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "def node_clustering(G, u):\n",
    "    neighbors = G[u]\n",
    "    k = len(neighbors)\n",
    "    if k < 2:\n",
    "        return 0\n",
    "        \n",
    "    edges = [G.has_edge(v, w) for v, w in all_pairs(neighbors)]\n",
    "    return np.mean(edges)\n",
    "\n",
    "clustering_coefficient(lattice)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%timeit clustering_coefficient(lattice)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise:** What is the order of growth of `clustering_coefficient` in terms of `n`, `m`, and `k`?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "\"\"\"`clustering_coefficient` calls `node_clustering` once for each node.  \n",
    "`node_clustering` is quadratic in `k`, the number of neighbors.\n",
    "\n",
    "In a complete graph, `k = n-1`, so `node_clustering` is `O(n^2)` and \n",
    "`clustering_coefficient` is `O(n^3)`.\n",
    "\n",
    "But in a ring lattice, or any other graph where `k` is not proportional to `n`, \n",
    "`clustering_coefficient` is `O(k^2 n)`.\n",
    "\"\"\";"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Path length"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following function computes path lengths between all pairs of nodes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def path_lengths(G):\n",
    "    length_iter = nx.shortest_path_length(G)\n",
    "    for source, dist_map in length_iter:\n",
    "        for dest, dist in dist_map.items():\n",
    "            yield dist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The characteristic path length is the mean path length for all pairs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def characteristic_path_length(G):\n",
    "    return np.mean(list(path_lengths(G)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On a complete graph, the average path length should be 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "complete = nx.complete_graph(10)\n",
    "characteristic_path_length(complete)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On a ring lattice with `n=1000` and `k=10`, the mean is about 50"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lattice = make_ring_lattice(1000, 10)\n",
    "characteristic_path_length(lattice)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise:**  What is the mean path length in a ring lattice with `n=10` and `k=4`?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "lattice = make_ring_lattice(10, 4)\n",
    "characteristic_path_length(lattice)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The experiment"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This function generates a WS graph with the given parameters and returns a pair of (mean path length, clustering coefficient):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def run_one_graph(n, k, p):\n",
    "    \"\"\"Makes a WS graph and computes its stats.\n",
    "    \n",
    "    n: number of nodes\n",
    "    k: degree of each node\n",
    "    p: probability of rewiring\n",
    "    \n",
    "    returns: tuple of (mean path length, clustering coefficient)\n",
    "    \"\"\"\n",
    "    ws = make_ws_graph(n, k, p)    \n",
    "    mpl = characteristic_path_length(ws)\n",
    "    cc = clustering_coefficient(ws)\n",
    "    print(mpl, cc)\n",
    "    return mpl, cc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With `n=1000` and `k=10`, it takes about a second on my computer:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%time run_one_graph(1000, 10, 0.01)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we'll run it with a range of values for `p`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ps = np.logspace(-4, 0, 9)\n",
    "print(ps)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This function runs each value of `p` 20 times and returns a dictionary that maps from each `p` to a list of (mpl, cc) pairs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def run_experiment(ps, n=1000, k=10, iters=20):\n",
    "    \"\"\"Computes stats for WS graphs with a range of `p`.\n",
    "    \n",
    "    ps: sequence of `p` to try\n",
    "    n: number of nodes\n",
    "    k: degree of each node\n",
    "    iters: number of times to run for each `p`\n",
    "    \n",
    "    returns: sequence of (mpl, cc) pairs\n",
    "    \"\"\"\n",
    "    res = {}\n",
    "    for p in ps:\n",
    "        print(p)\n",
    "        res[p] = []\n",
    "        for _ in range(iters):\n",
    "            res[p].append(run_one_graph(n, k, p))\n",
    "    return res"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are the raw results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "res = run_experiment(ps)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we have to extract them in a form we can plot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "L = []\n",
    "C = []\n",
    "for p, t in sorted(res.items()):\n",
    "    mpls, ccs = zip(*t)\n",
    "    mpl = np.mean(mpls)\n",
    "    cc = np.mean(ccs)\n",
    "    L.append(mpl)\n",
    "    C.append(cc)\n",
    "    \n",
    "print(L)\n",
    "print(C)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And normalize them so they both start at 1.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "L = np.array(L) / L[0]\n",
    "C = np.array(C) / C[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's the plot that replicates Watts and Strogatz's Figure 2."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "thinkplot.plot(ps, L, style='o-', linewidth=1)\n",
    "thinkplot.plot(ps, C, style='s-', linewidth=1)\n",
    "thinkplot.text(0.001, 0.9, 'C(p) / C(0)')\n",
    "thinkplot.text(0.0005, 0.25, 'L(p) / L(0)')\n",
    "thinkplot.config(xlabel='p', xscale='log',\n",
    "                 title='Normalized clustering coefficient and path length',\n",
    "                 xlim=[0.00009, 1.1], ylim=[-0.01, 1.01])\n",
    "plt.savefig('chap03-3.pdf')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Breadth-first search"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's see how the shortest path algorithm works.  We'll start with BFS, which is the basis for Dijkstra's algorithm.\n",
    "\n",
    "Here's our old friend, the ring lattice:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lattice = make_ring_lattice(10, 4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nx.draw_circular(lattice, \n",
    "                 node_color=COLORS[2], \n",
    "                 node_size=1000, \n",
    "                 with_labels=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And here's my implementation of BFS using a deque."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import deque\n",
    "\n",
    "def reachable_nodes_bfs(G, start):\n",
    "    \"\"\"Finds reachable nodes by BFS.\n",
    "    \n",
    "    G: graph\n",
    "    start: node to start at\n",
    "    \n",
    "    returns: set of reachable nodes\n",
    "    \"\"\"\n",
    "    seen = set()\n",
    "    queue = deque([start])\n",
    "    while queue:\n",
    "        node = queue.popleft()\n",
    "        if node not in seen:\n",
    "            seen.add(node)\n",
    "            queue.extend(G.neighbors(node))\n",
    "    return seen"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It works:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "reachable_nodes_bfs(lattice, 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's a version that's a little faster, but maybe less readable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def reachable_nodes_bfs(G, start):\n",
    "    \"\"\"Finds reachable nodes by BFS.\n",
    "    \n",
    "    G: graph\n",
    "    start: node to start at\n",
    "    \n",
    "    returns: set of reachable nodes\n",
    "    \"\"\"\n",
    "    seen = set()\n",
    "    queue = deque([start])\n",
    "    while queue:\n",
    "        node = queue.popleft()\n",
    "        if node not in seen:\n",
    "            seen.add(node)\n",
    "            neighbors = set(G[node]) - seen\n",
    "            queue.extend(neighbors)\n",
    "    return seen"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It works, too."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "reachable_nodes_bfs(lattice, 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Dijkstra's algorithm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we're ready for Dijkstra's algorithm, at least for graphs where all the edges have the same weight/length."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def shortest_path_dijkstra(G, start):\n",
    "    \"\"\"Finds shortest paths from `start` to all other nodes.\n",
    "    \n",
    "    G: graph\n",
    "    start: node to start at\n",
    "    \n",
    "    returns: make from node to path length\n",
    "    \"\"\"\n",
    "    dist = {start: 0}\n",
    "    queue = deque([start])\n",
    "    while queue:\n",
    "        node = queue.popleft()\n",
    "        new_dist = dist[node] + 1\n",
    "\n",
    "        neighbors = set(G[node]).difference(dist)\n",
    "        for n in neighbors:\n",
    "            dist[n] = new_dist\n",
    "        \n",
    "        queue.extend(neighbors)\n",
    "    return dist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Again, we'll test it on a ring lattice."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lattice = make_ring_lattice(10, 4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nx.draw_circular(lattice, \n",
    "                 node_color=COLORS[3], \n",
    "                 node_size=1000, \n",
    "                 with_labels=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's my implementation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d1 = shortest_path_dijkstra(lattice, 0)\n",
    "d1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And here's the result from NetworkX:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d2 = nx.shortest_path_length(lattice, 0)\n",
    "d2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "They are the same:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "d1 == d2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "**Exercise:** In a ring lattice with `n=1000` and `k=10`, which node is farthest from 0 and how far is it?  Use `shortest_path_dijkstra` to check your answer.\n",
    "\n",
    "Note: the maximum distance between two nodes is the **diameter** of the graph."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "lattice = make_ring_lattice(1000, 10)\n",
    "d = shortest_path_dijkstra(lattice, 0)\n",
    "\n",
    "from operator import itemgetter\n",
    "node, dist = sorted(d.items(), key=itemgetter(1))[-1]\n",
    "node, dist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise:** In a ring lattice, every node has the same number of neighbors.  The number of neighbors is called the **degree** of the node, and a graph where all nodes have the same degree is called a **regular graph**.\n",
    "\n",
    "All ring lattices are regular, but not all regular graphs are ring lattices.  In particular, if `k` is odd, we can't construct a ring lattice, but we might be able to construct a regular graph.\n",
    "\n",
    "Write a function called `make_regular_graph` that takes `n` and `k` and returns a regular graph that contains `n` nodes, where every node has `k` neighbors.  If it's not possible to make a regular graph with the given values of `n` and `k`, the function should raise a `ValueError`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "# Here's `adjacent_edges` again for comparison:\n",
    "\n",
    "def adjacent_edges(nodes, halfk):\n",
    "    n = len(nodes)\n",
    "    for i, u in enumerate(nodes):\n",
    "        for j in range(i+1, i+halfk+1):\n",
    "            v = nodes[j % n]\n",
    "            yield u, v"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "# And here's a function that computes edges that connect each\n",
    "# node to the one half-way around the circle\n",
    "\n",
    "def opposite_edges(nodes):\n",
    "    \"\"\"Enumerates edges that connect opposite nodes.\"\"\"\n",
    "    n = len(nodes)\n",
    "    for i, u in enumerate(nodes):\n",
    "        j = i + n//2\n",
    "        v = nodes[j % n]\n",
    "        yield u, v"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "# Now we can make regular graphs.\n",
    "\n",
    "def make_regular_graph(n, k):\n",
    "    \"\"\"Makes graph with `n` nodes where all nodes have `k` neighbors.\n",
    "    \n",
    "    Not possible if both `n` and `k` are odd.\n",
    "    \"\"\"\n",
    "    # a is the number of adjacent edges\n",
    "    # b is the number of opposite edges (0 or 1)\n",
    "    a, b = divmod(k, 2)\n",
    "    \n",
    "    G = nx.Graph()\n",
    "    nodes = range(n)\n",
    "    G.add_nodes_from(nodes)\n",
    "    G.add_edges_from(adjacent_edges(nodes, a))\n",
    "    \n",
    "    # if k is odd, add opposite edges\n",
    "    if b:\n",
    "        if n%2:\n",
    "            msg = \"Can't make a regular graph if n and k are odd.\"\n",
    "            raise ValueError(msg)\n",
    "        G.add_edges_from(opposite_edges(nodes))\n",
    "    return G"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "# Here's an example.\n",
    "\n",
    "regular = make_regular_graph(10, 3)\n",
    "\n",
    "nx.draw_circular(regular, \n",
    "                 node_color=COLORS[4], \n",
    "                 node_size=1000, \n",
    "                 with_labels=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "**Exercise:** My implementation of `reachable_nodes_bfs` is efficient in the sense that it is in $O(n + m)$, but it incurs a lot of overhead adding nodes to the queue and removing them.  NetworkX provides a simple, fast implementation of BFS, available from [the NetworkX repository on GitHub](https://github.com/networkx/networkx/blob/master/networkx/algorithms/components/connected.py).\n",
    "\n",
    "Here is a version I modified to return a set of nodes:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plain_bfs(G, source):\n",
    "    \"\"\"A fast BFS node generator\"\"\"\n",
    "    seen = set()\n",
    "    nextlevel = {source}\n",
    "    while nextlevel:\n",
    "        thislevel = nextlevel\n",
    "        nextlevel = set()\n",
    "        for v in thislevel:\n",
    "            if v not in seen:\n",
    "                seen.add(v)\n",
    "                nextlevel.update(G[v])\n",
    "    return seen"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compare this function to `reachable_nodes_bfs` and see which is faster.  Then see if you can modify this function to implement a faster version of `shortest_path_dijkstra`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "lattice = make_ring_lattice(1000, 10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "%timeit len(reachable_nodes_bfs(lattice, 0))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "%timeit len(plain_bfs(lattice, 0))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "#The version from NetworkX is substantially faster!\n",
    "\n",
    "#Here's a version of Dijkstra's algorithm that works the same way:\n",
    "\n",
    "def plain_shortest_path(G, source):\n",
    "    \"\"\"A fast version of Dijkstra's algorithm for equal edges.\"\"\"\n",
    "    new_dist = 0\n",
    "    dist = {}\n",
    "    nextlevel = {source}\n",
    "    while nextlevel:\n",
    "        thislevel = nextlevel\n",
    "        nextlevel = set()\n",
    "        for v in thislevel:\n",
    "            if v not in dist:\n",
    "                dist[v] = new_dist\n",
    "                nextlevel.update(G[v])\n",
    "        new_dist += 1\n",
    "    return dist"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "#It gets the right answers\n",
    "\n",
    "lattice = make_ring_lattice(1000, 10)\n",
    "d1 = shortest_path_dijkstra(lattice, 0)\n",
    "d2 = plain_shortest_path(lattice, 0)\n",
    "d1 == d2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "# And it is substantually faster than the version that uses a deque.\n",
    "\n",
    "%timeit shortest_path_dijkstra(lattice, 0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "%timeit plain_shortest_path(lattice, 0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "%timeit nx.shortest_path_length(lattice, 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise:** The following implementation of a BFS contains two performance errors.  What are\n",
    "they?  What is the actual order of growth for this algorithm?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def bfs(top_node, visit):\n",
    "    \"\"\"Breadth-first search on a graph, starting at top_node.\"\"\"\n",
    "    visited = set()\n",
    "    queue = [top_node]\n",
    "    while len(queue):\n",
    "        curr_node = queue.pop(0)    # Dequeue\n",
    "        visit(curr_node)            # Visit the node\n",
    "        visited.add(curr_node)\n",
    "\n",
    "        # Enqueue non-visited and non-enqueued children\n",
    "        queue.extend(c for c in curr_node.children\n",
    "                     if c not in visited and c not in queue)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "\"\"\"The first performance error is using `pop(0)` on a list, which is linear in\n",
    "the length of the list.  The second error is checking whether the children are \n",
    "in queue, which is also linear in the length of the list.  In the worst case, \n",
    "a completely connected graph, the queue loop runs `n` times, and each time we \n",
    "have to check `n` nodes to see if they are in a list with `n` elements, so the \n",
    "total run time is `O(n^3)`, which is really terrible.\n",
    "\n",
    "By the way, I did not make this example up.  It used to be on \n",
    "[the Wikipedia page for BFS](https://en.wikipedia.org/wiki/Breadth-first_search).\n",
    "In fact, if you search the Internet for Python implementations of BFS, many of \n",
    "them contain at least one performance error.\n",
    "\"\"\"\n",
    "None"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Exercise:** In the book, I claimed that Dijkstra's algorithm does not work unless it uses BFS.  Write a version of `shortest_path_dijkstra` that uses DFS and test it on a few examples to see what goes wrong."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Solution\n",
    "\n",
    "# Here's the broken version:\n",
    "\n",
    "def shortest_path_dfs(G, start):\n",
    "    dist = {start: 0}\n",
    "    queue = deque([start])\n",
    "    while queue:\n",
    "        node = queue.pop()\n",
    "        new_dist = dist[node] + 1\n",
    "\n",
    "        neighbors = set(G[node]).difference(dist)\n",
    "        for n in neighbors:\n",
    "            dist[n] = new_dist\n",
    "        \n",
    "        queue.extend(neighbors)\n",
    "    return dist\n",
    "\n",
    "#Sure enough, it gets the answers wrong\n",
    "\n",
    "lattice = make_ring_lattice(10, 4)\n",
    "d1 = shortest_path_dfs(lattice, 0)\n",
    "print(d1)\n",
    "d2 = nx.shortest_path_length(lattice, 0)\n",
    "print(d2)\n",
    "d1 == d2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
